From Regular Expressions to Deterministic Automata
Authors
Abstract
The main theorem allows an elegant algorithm to be refined into an efficient one. The elegant algorithm for constructing a finite automaton from a regular expression is based on 'derivatives of' regular expressions; the efficient algorithm is based on 'marking of' regular expressions. Derivatives of regular expressions correspond to state transitions in finite automata. When a finite automaton makes a transition under input symbol a, a leading a is stripped from the remaining input. Correspondingly, if the input string is generated by a regular expression E, then the derivative of E by a generates the remaining input after a leading a is stripped. Brzozowski (1964) used derivatives to construct finite automata; the state for expression E has a transition under a to the state for the derivative of E by a. This approach extends to regular expressions with new operators, including intersection and complement; however, explicit computation of derivatives can be expensive. Marking of regular expressions yields an expression with distinct input symbols. Following McNaughton and Yamada (1960), we attach subscripts to each input symbol in an expression; (ab+b)*ba becomes (a1b2+b3)*b4a5. Conceptually, the efficient algorithm constructs an automaton for the marked expression. The marks on the transitions are then erased, resulting in a nondeterministic automaton for the original unmarked expression. This approach works for the usual operations of union, concatenation, and iteration; however, intersection and complement cannot be handled because marking and unmarking do not preserve the languages generated by regular expressions with these operators.
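The two ideas in the abstract can be illustrated with a minimal Python sketch. It is not the paper's algorithm: the class names (`Empty`, `Eps`, `Sym`, `Alt`, `Cat`, `Star`) and the functions `nullable`, `deriv`, `matches`, and `mark` are hypothetical names chosen for illustration. The sketch computes derivatives symbol by symbol to test membership, and marks an expression by appending a position number to each symbol. It does not simplify derivatives or identify equivalent ones, so it does not by itself build a finite automaton.

```python
import itertools
from dataclasses import dataclass

# Regular-expression syntax tree (hypothetical names, not from the paper).
@dataclass(frozen=True)
class Empty: pass          # generates no string

@dataclass(frozen=True)
class Eps: pass            # generates only the empty string

@dataclass(frozen=True)
class Sym:
    c: object              # one input symbol

@dataclass(frozen=True)
class Alt:                 # union, written l + r
    l: object
    r: object

@dataclass(frozen=True)
class Cat:                 # concatenation
    l: object
    r: object

@dataclass(frozen=True)
class Star:                # iteration
    e: object

def nullable(e):
    """True if e generates the empty string."""
    if isinstance(e, (Eps, Star)):
        return True
    if isinstance(e, (Empty, Sym)):
        return False
    if isinstance(e, Alt):
        return nullable(e.l) or nullable(e.r)
    return nullable(e.l) and nullable(e.r)   # Cat

def deriv(e, a):
    """Derivative of e by a: generates w exactly when e generates a followed by w."""
    if isinstance(e, (Empty, Eps)):
        return Empty()
    if isinstance(e, Sym):
        return Eps() if e.c == a else Empty()
    if isinstance(e, Alt):
        return Alt(deriv(e.l, a), deriv(e.r, a))
    if isinstance(e, Star):
        return Cat(deriv(e.e, a), e)
    first = Cat(deriv(e.l, a), e.r)          # Cat: strip a from the left factor
    return Alt(first, deriv(e.r, a)) if nullable(e.l) else first

def matches(e, symbols):
    """Strip one leading symbol per step, as an automaton transition would."""
    for a in symbols:
        e = deriv(e, a)
    return nullable(e)

def mark(e, ids=None):
    """Make every symbol distinct by appending its left-to-right position."""
    ids = ids if ids is not None else itertools.count(1)
    if isinstance(e, Sym):
        return Sym(f"{e.c}{next(ids)}")
    if isinstance(e, (Alt, Cat)):
        left = mark(e.l, ids)                # number the left operand first
        right = mark(e.r, ids)
        return type(e)(left, right)
    if isinstance(e, Star):
        return Star(mark(e.e, ids))
    return e                                 # Empty and Eps carry no symbols

# The abstract's example expression (ab+b)*ba and its marked form.
E = Cat(Cat(Star(Alt(Cat(Sym("a"), Sym("b")), Sym("b"))), Sym("b")), Sym("a"))
M = mark(E)
```

Here `matches(E, "abba")` holds while `matches(E, "ab")` does not, and `M` corresponds to the marked expression (a1b2+b3)*b4a5, which accepts the marked input `["b4", "a5"]` but not the unmarked string `"ba"`. Because `deriv` never simplifies its result, expressions grow with each step; a construction in the style of Brzozowski would additionally need to recognize when two derivatives denote the same state.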
Similar resources
A novel algorithm for the conversion of shuffle regular expressions into non-deterministic finite automata
Regular expressions with shuffle operators are widely used in diverse fields of computer science. The work presented here investigates the shuffling of regular expressions and their conversion into non-deterministic finite automata. The aim of the paper is to design a novel algorithm for constructing ε-free non-deterministic finite automata from the shuffling of regular expressions. Non-determ...
Block-Deterministic Regular Languages
We introduce the notions of blocked, block-marked and block-deterministic regular expressions. We characterize block-deterministic regular expressions with deterministic Glushkov block automata. The results can be viewed as a generalization of the characterization of one-unambiguous regular expressions with deterministic Glushkov automata. In addition, when a language L has a block-determinist...
Model Checking Regular Language Constraints
Even the fastest SMT solvers have performance problems with regular expressions from real programs. Because these performance issues often arise from the problem representation (e.g. non-deterministic finite automata get determinized and regular expressions get unrolled), we revisit Boolean finite automata, which allow for the direct and natural representation of any Boolean combination of regu...
Learning Regular Languages via Alternating Automata
Nearly all algorithms for learning an unknown regular language, in particular the popular L∗ algorithm, yield deterministic finite automata. It was recently shown that the ideas of L∗ can be extended to yield non-deterministic automata, and that the respective learning algorithm, NL∗, outperforms L∗ on randomly generated regular expressions. We conjectured that this is due to the existential na...
A Bialgebraic Review of Regular Expressions, Deterministic Automata and Languages
This paper reviews the classical theory of deterministic automata and regular languages from a categorical perspective. The basis is formed by Rutten's description of the Brzozowski automaton structure in a coalgebraic framework. We enlarge the framework to a so-called bialgebraic one, by including algebras together with suitable distributive laws connecting the algebraic and coalgebraic struc...
A Novel Algorithm for the Conversion of Parallel Regular Expressions to Non-deterministic Finite Automata
The aim of the paper is to concoct a novel algorithm for the metamorphosis of parallel regular expressions to ε-free nondeterministic finite automata. For a given parallel regular expression r, let m be the number of symbols that occur in r and let C denote the number of concatenation operators in r. In the worst case, 2m+1 states are required for the construction of the non-deterministic finit...
Journal title:
- Theor. Comput. Sci.
Volume 48, Issue
Pages -
Publication date 1986